NSF PAR Search | NSF Public Access Repository

A survey on deep learning for drug-target binding prediction: models, benchmarks, evaluation, and case studies

https://doi.org/10.1093/bib/bbaf491

Debnath, Kusal; Rana, Pratip; Ghosh, Preetam (August 2025, Briefings in Bioinformatics)

Conventional drug discovery is expensive, time-consuming, and prone to failure. Artificial intelligence has become a potent substitute over the last decade, providing strong answers to challenging biological issues in this field. Among these difficulties, drug-target binding (DTB) is a key component of drug discovery techniques. In this context, drug-target affinity and drug–target interaction are complementary and essential frameworks that work together to improve our comprehension of DTB dynamics. In this work, we thoroughly analyze the most recent deep learning models, popular benchmark datasets, and assessment metrics for DTB prediction. We look at the paradigm shift in the development of drug discovery research since researchers started using deep learning as a potent tool for DTB prediction. In particular, we examine how methodologies have evolved, starting with early heterogeneous network-based approaches, progressing to graph-based approaches that were widely accepted, followed by modern attention-based architectures, and finally, the most recent multimodal approaches. We also provide case studies utilizing an extensive compound library against specific protein targets implicated in critical cancer pathways to demonstrate the usefulness of these approaches. In addition to summarizing the latest developments in DTB prediction models, this review also identifies their drawbacks. It also highlights the outlook for the DTB prediction domain and future research directions. Combined, these studies present a more comprehensive view of how deep learning offers a quantitative framework for researching drug-target relationships, speeding up the identification of new drug candidates and making it easier to identify possible DTBs.
Free, publicly-accessible full text available August 31, 2026
GramSeq-DTA: A Grammar-Based Drug–Target Affinity Prediction Approach Fusing Gene Expression Information

https://doi.org/10.3390/biom15030405

Debnath, Kusal; Rana, Pratip; Ghosh, Preetam (March 2025, Biomolecules)

Drug–target affinity (DTA) prediction is a critical aspect of drug discovery. The meaningful representation of drugs and targets is crucial for accurate prediction. Using 1D string-based representations for drugs and targets is a common approach that has demonstrated good results in drug–target affinity prediction. However, this approach lacks information on the relative position of the atoms and bonds. To address this limitation, graph-based representations have been used to some extent. However, solely considering the structural aspect of drugs and targets may be insufficient for accurate DTA prediction. Integrating the functional aspect of these drugs at the genetic level can enhance the prediction capability of the models. To fill this gap, we propose GramSeq-DTA, which integrates chemical perturbation information with the structural information of drugs and targets. We applied a Grammar Variational Autoencoder (GVAE) for drug feature extraction and utilized two different approaches for protein feature extraction as follows: a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The chemical perturbation data are obtained from the L1000 project, which provides information on the up-regulation and down-regulation of genes caused by selected drugs. This chemical perturbation information is processed, and a compact dataset is prepared, serving as the functional feature set of the drugs. By integrating the drug, gene, and target features in the model, our approach outperforms the current state-of-the-art DTA prediction models when validated on widely used DTA datasets (BindingDB, Davis, and KIBA). This work provides a novel and practical approach to DTA prediction by merging the structural and functional aspects of biological entities, and it encourages further research in multi-modal DTA prediction.
Free, publicly-accessible full text available March 1, 2026
Heterogeneous Clustering of Multiomics Data for Breast Cancer Subgroup Classification and Detection

https://doi.org/10.3390/ijms26041707

Pateras, Joseph; Lodi, Musaddiq; Rana, Pratip; Ghosh, Preetam (February 2025, International Journal of Molecular Sciences)

The rapid growth of diverse -omics datasets has made multiomics data integration crucial in cancer research. This study adapts the expectation–maximization routine for the joint latent variable modeling of multiomics patient profiles. By combining this approach with traditional biological feature selection methods, this study optimizes latent distribution, enabling efficient patient clustering from well-studied cancer types with reduced computational expense. The proposed optimization subroutines enhance survival analysis and improve runtime performance. This article presents a framework for distinguishing cancer subtypes and identifying potential biomarkers for breast cancer. Key insights into individual subtype expression and function were obtained through differentially expressed gene analysis and pathway enrichment for BRCA patients. The analysis compared 302 tumor samples to 113 normal samples across 60,660 genes. The highly upregulated gene COL10A1, promoting breast cancer progression and poor prognosis, and the consistently downregulated gene CD300LG, linked to brain metastatic cancer, were identified. Pathway enrichment analysis revealed similarities in cellular matrix organization pathways across subtypes, with notable differences in functions like cell proliferation regulation and endocytosis by host cells. GO Semantic Similarity analysis quantified gene relationships in each subtype, identifying potential biomarkers like MATN2, similar to COL10A1. These insights suggest deeper relationships within clusters and highlight personalized treatment potential based on subtypes.
Free, publicly-accessible full text available February 1, 2026
Neuron enriched extracellular vesicles’ MicroRNA expression profiles as a marker of early life alcohol consumption

https://doi.org/10.1038/s41398-024-02874-3

Yakovlev, Vasily; Lapato, Dana M; Rana, Pratip; Ghosh, Preetam; Frye, Rebekah; Roberson-Nay, Roxann (December 2024, Translational Psychiatry)

Alcohol consumption may impact and shape brain development through perturbed biological pathways and impaired molecular functions. We investigated the relationship between alcohol consumption rates and neuron-enriched extracellular vesicles’ (EVs’) microRNA (miRNA) expression to better understand the impact of alcohol use on early life brain biology. Neuron-enriched EVs’ miRNA expression was measured from plasma samples collected from young people using a commercially available microarray platform while alcohol consumption was measured using the Alcohol Use Disorders Identification Test. Linear regression and network analyses were used to identify significantly differentially expressed miRNAs and to characterize the implicated biological pathways, respectively. Compared to alcohol naïve controls, young people reporting high alcohol consumption exhibited significantly higher expression of three neuron-enriched EVs’ miRNAs including miR-30a-5p, miR-194-5p, and miR-339-3p, although only miR-30a-5p and miR-194-5p survived multiple test correction. The miRNA-miRNA interaction network inferred by a network inference algorithm did not detect any differentially expressed miRNAs with a high cutoff on edge scores. However, when the cutoff of the algorithm was reduced, five miRNAs were identified as interacting with miR-194-5p and miR-30a-5p. These seven miRNAs were associated with 25 biological functions; miR-194-5p was the most highly connected node and was highly correlated with the other miRNAs in this cluster. Our observed association between neuron-enriched EVs’ miRNAs and alcohol consumption concurs with results from experimental animal models of alcohol use and suggests that high rates of alcohol consumption during the adolescent/young adult years may impact brain functioning and development by modulating miRNA expression.
Full Text Available
Global fitting and parameter identifiability for amyloid-β aggregation with competing pathways

https://doi.org/10.1109/BIBE50027.2020.00020

Rana, Pratip; Bose, Priyankar; Vaidya, Ashwin; Rangachari, Vijay; Ghosh, Preetam (October 2020, 2020 IEEE 20th International Conference on BioInformatics and BioEngineering (BIBE))
Full Text Available
Recent advances on constraint-based models by integrating machine learning

https://doi.org/10.1016/j.copbio.2019.11.007

Rana, Pratip; Berry, Carter; Ghosh, Preetam; Fong, Stephen S (August 2020, Current Opinion in Biotechnology)
Full Text Available
A game-theoretic approach to deciphering the dynamics of amyloid-β aggregation along competing pathways

https://doi.org/10.1098/rsos.191814

Ghosh, Preetam; Rana, Pratip; Rangachari, Vijayaraghavan; Saha, Jhinuk; Steen, Edward; Vaidya, Ashwin (April 2020, Royal Society Open Science)

Full Text Available
Evaluation of the Common Molecular Basis in Alzheimer’s and Parkinson’s Diseases

https://doi.org/10.3390/ijms20153730

Rana, Pratip; Franco, Edian F.; Rao, Yug; Syed, Khajamoinuddin; Barh, Debmalya; Azevedo, Vasco; Ramos, Rommel T.; Ghosh, Preetam (August 2019, International Journal of Molecular Sciences)

Alzheimer’s disease (AD) and Parkinson’s disease (PD) are the most common neurodegenerative disorders related to aging. Though several risk factors are shared between these two diseases, the exact relationship between them is still unknown. In this paper, we analyzed how these two diseases relate to each other from the genomic, epigenomic, and transcriptomic viewpoints. Using an extensive literature mining, we first accumulated the list of genes from major genome-wide association (GWAS) studies. Based on these GWAS studies, we observed that only one gene (HLA-DRB5) was shared between AD and PD. A subsequent literature search identified a few other genes involved in these two diseases, among which SIRT1 seemed to be the most prominent one. While we listed all the miRNAs that have been previously reported for AD and PD separately, we found only 15 different miRNAs that were reported in both diseases. In order to get better insights, we predicted the gene co-expression network for both AD and PD using network analysis algorithms applied to two GEO datasets. The network analysis revealed six clusters of genes related to AD and four clusters of genes related to PD; however, there was very low functional similarity between these clusters, pointing to insignificant similarity between AD and PD even at the level of affected biological processes. Finally, we postulated the putative epigenetic regulator modules that are common to AD and PD.
Full Text Available
Cause and consequence of Aβ – Lipid interactions in Alzheimer disease pathogenesis

https://doi.org/10.1016/j.bbamem.2018.03.004

Rangachari, Vijayaraghavan; Dean, Dexter N.; Rana, Pratip; Vaidya, Ashwin; Ghosh, Preetam (September 2018, Biochimica et Biophysica Acta (BBA) - Biomembranes)

Full Text Available
Benchmarking the communication fidelity of biomolecular signaling cascades featuring pseudo-one-dimensional transport

https://doi.org/10.1063/1.5027508

Rana, Pratip; Pilkiewicz, Kevin R.; Mayo, Michael L.; Ghosh, Preetam (May 2018, AIP Advances)

Full Text Available